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Vertex cover is one of the classical NP-complete problems in theoretical computer science. A vertex 
cover of a graph is a subset of vertices such that for each edge at least one of the two endpoints is 
contained in the subset. When studied on Erdos-Renyi random graphs (with connectivity c) one 
observes a threshold behavior: In the thermodynamic limit the size of the minimal vertex cover is 
independent of the specific graph. Recent analytical studies show that on the phase boundary, for 
small connectivities c < e, the system is replica symmetric, while for larger connectivities replica 
symmetry breaking occurs. This change coincides with a change of the typical running time of 
algorithms from polynomial to exponential. 

To understand the reasons for this behavior and to compare with the analytical results, we nu- 
merically analyze the structure of the solution landscape. For this purpose, we have also developed 
an algorithm, which allows the calculation of the backbone, without the need to enumerate all solu- 
tions. We study exact solutions found with a Branch-and-Bound algorithm as well as configurations 
obtained via a Monte Carlo simulation. 

We analyze the cluster structure of the solution landscape by direct clustering of the states, by 
analyzing the eigenvalue spectrum of correlation matrices and by using a hierarchical clustering 
method. All results are compatible with a change at c = e. For small connectivities, the solutions 
are collected in a finite small number of clusters, while the number of cluster diverges slowly with 
system size for larger connectivities and replica symmetry breaking, but not 1-RSB, occurs. 



I. INTRODUCTION 

In combinatorial optimization problems, one has to 
minimize a certain function over a discrete phase space 
consisting of e.g. 2^ elements. Often for a given realiza- 
tion (which we will also call instance) there is more than 
one point where the function takes the global minimum 
value. All these points are called solutions or ground 
states. In our paper we will deal with the phenomenon 
of clustering: Usually the ground states are not equally 
distributed over the phase space. They cluster in one 
or many groups that are separated by regions where the 
function takes values that are larger than the global min- 
imum. 

Such clustering has already been observed in statisti- 
cal physics when studying spin glasses [3] ■ For the mean- 
field Ising spinglass, also called Sherrington-Kirkpatrick 
(SK) model [3, Parisi has constructed Q, using the 
replica trick [J, |5| , an analytic solution for the free en- 
ergy. This solution exhibits replica-symmetry breaking 
(RSB) , which means that the state space is organized in 
an infinitely nested hierarchy of clusters of states, char- 
acterized by ultrametricity [g. Recently, this solution 
was mathematically proven to be the exact one 0. Also 
in numerical studies the clustering structure of the SK 
model has been observed, e.g. by calculating the distribu- 
tion of overlaps ^3, , when studying the spectrum of 
spin-spin correlation matrices [TlL [12] or when applying 
direct clustering For finite-dimensional spin glasses, 
RSB seems not to be present fully 0, at least not 
in the same way as for the mean-field model, since clus- 
tering has been observed numerically but in a different 
non-ultrametric way p^ . On the other hand, for models 



like Ising ferromagnets it is clear that they do not exhibit 
RSB and all solutions are organized in one cluster. 

The use of such analytical tools from statistical me- 
chanics enabled physicists recently to contribute to the 
analysis of problems that originate in theoretical com- 
puter science. Well known problems of this kind are the 
satisfiability (SAT) problem 16, 17, 18], number par- 
titionin g 11^. 1201 . g raph coloring j2l|, and vertex cover 
|23 I23L I24 I25I l26j] In computer science, one rewrites 
these optimization problems as decision problems, i.e. 
as problems where only the answers "yes" and "no" are 
possible. Here, it means the question "Is there a solu- 
tion where the function takes a value less than a;" ? The 
above mentioned problems belong to the class NP "2?] of 
decision problems. This means that for any given input 
the function can be evaluated easily, i.e. in a time poly- 
nomial in the size of the input (measured e.g. in bits). 
A single suitable input, for which the function takes a 
value less than cc, proves that the answer to the deci- 
sion problem is "yes" . The open question is whether one 
can find a polynomial-time algorithm that for every pos- 
sible instance and value of x either constructs such an 
input or, in case no such input exists, a proof for the 
non-existence, which can be checked in polynomial time. 
For the so called NP-complete problems up to now only 
algorithms with an exponentially growing running time 
in the worst case are known. But these instances are in 
some regions of the instance space exponentially rare so 
one can do better in the typical case. 

What is "typical" cannot be defined uniquely, hence 
one has to study suitable parametrized (usually random) 
ensembles of problem instances. By varying these param- 
eters one often finds phase boundaries which separate re- 
gions where the answer to the decision problem is "yes" 
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resp. "no" with probability one |2^|2^. Analytically, 
the phase diagrams of these problems can be studied us- 
ing some well-know techniques from statistical physics, 
like the replica trick [sol Isij , or the cavity approach |0 . 
But full solutions have not been found in the most cases, 
since the problems from theoretical computer science are 
usually not defined on complete graphs but on diluted 
graphs, which poses additional technical problems. Usu- 
ally, one can only calculate the solution in the case of 
replica symmetry 0, [s^, |^ , or in the case of one-step 
replica symmetry breaking (1-RSB) [TtL IT^ . and look 
for the stability of the solutions. For this reason, the 
relation between the solution and the clustering struc- 
ture is not well established and it is far from being clear 
for most models how the clustering structure looks like. 
However, most statistical physicist believe that the fail- 
ure of replica symmetry (RS) leads indeed to clustering 
[3^ Issf. So far, only few analytical studies of the clus- 
tering properties of classical combinatorial o ptim ization 
problems like SAT have been performed l34|. These 
results depend or may depend on the specific assump- 
tions one makes when applying certain analytical tools 
and when performing approximations. In particular, it 
is unlikely that the clustering of models on dilute graphs 
is exactly the same as it is found for the mean-field SK 
spin glass. So from the physicist point of view, it is quite 
interesting to study the organization of the phase space 
using numerical methods to understand better the mean- 
ing of "complex organization of phase space" for other, 
non-mean-field models, like combinatorial optimization 
problems. It is the aim of this paper, to study numeri- 
cally the clustering properties of one particular problem, 
the vertex-cover problem (see below) using three different 
complementary approaches. 

The study of the solution structure is not only impor- 
tant for physicists, but also of interest for computer sci- 
ence. From an algorithmic point of view, especially the 
ground-state structure seems to play an important role. 
If it consists only of a single cluster, finding a solution 
typically will be easy. In this case one can often construct 
algorithms that quickly detect the promising regions of 
the solution space. On the other hand the appearance 
of clustered ground states is often accompanied by the 
existence of suboptimal local minima of the function to 
be optimized ,32i]. They mislead local algorithms and 
make computation expensive. In this case, the typical 
computation time grows exponentially in the the size of 
the instance. 

Such easy-hard transitions have been observed in many 
optimization problems, first by studying SAT numeri- 
cally m H^. For SAT, the "yes" -phase (which is re- 
ferred to as SAT-phase) is split up into two regions: an 
easy-SAT and a /lard-SAT phase, which have exactly the 
properties described above. For all known algorithms the 
onset of exponential median running times is in the easy 
phase, although during the last years better and better 
heuristics extended the region of instances that can be 
solved typically in polynomial time. However it is an 



open question, whether this phase boundary really gives 
an upper bound for the best heuristics possible. 

In our paper we deal with the minimal vertex-cover 
(VC) problem. We consider random graphs G — (V, E) 
with N vertices j G 1, 2, . . . and |A randomly drawn, 
undirected edges ^ E d V x V , each connecting a 

pair of vertices. In this notation c is the connectivity, i.e. 
the average number of edges each vertex is contained in. 

Let us briefly recall properties of random graphs that 
are relevant to our analysis of the ground-state structure 
j38| |. For c < 1 the typical random graph only consists 
of small trees each with size of 0{1). Additionally there 
is a finite number of components, also with size of 0(1), 
each having one closed loop, e.g. for ^ 00 the frac- 
tion of closed loops approaches zero 's^. For c > 1, 
the finite-size tree-like components and the components 
with loops remain, but there is one additional compo- 
nent which contains 0{N) vertices, the so-called giant 
component. 

Let V' C V he a subset of all vertices. We call a vertex 
V covered ii v G V' , uncovered if v ^ V' . Similarly an 
edge is covered if at least one of its endpoints is covered. 
If all edges of G are covered, then we call V a vertex 
cover Vvc- We denote X = \V'\ and x = X/N. 

For a graph G = {V, E) the minimal VC problem is 
the following optimization problem: Construct a vertex 
cover Vvc-min C y of minimal cardinality and find its 
size ATmin = \Vvc-min\- Usually there are many solutions 
of the same size. The backbone consists of those vertices, 
which appear in all solutions in the same manner, i.e. 
which are always covered or always uncovered. 

Algorithmically, one can solve the minimal vertex- 
cover problem independently for each component of the 
underlying graph. Any combination of the vertex covers 
of the individual components gives a VC for G. For c < 1, 
where no giant component exists, since the different com- 
ponents are independent and of size 0(1), we cannot ex- 
pect a complicated ground-state structure. Furthermore, 
as we show in the appendix, the solution structure for 
trees is always simple. Hence the main emphasis of the 
paper will be on studying the giant component appearing 
for c > 1, since only this component can be responsible 
for a complex ground-state structure. 

The vertex-cover problem on random graphs exhibits 
the threshold phenomenon described above. For mini- 
mum covers, in the limit A'^ 00, one expects that the 
fraction of covered vertices depends only on the con- 
nectivity c, i.e. we have a;niin = ;Cniin(c). For x < x^in{c) 
almost no graphs with connectivity c have a VC of this 
size, on the opposite for x > a;inin(c) such a cover can be 
found with probability 1. 

By applying the replica method 0, one can derive 
analytical results for this phase boundary. In the 
language of statistical physics, we can think of the size of 
a VC as its energy, and of VCs of minimal size as ground 
states. Using a replica symmetric (RS) ansatz one gets 
^(c) ^ I _ 2W{c)+w{cf ^ ^j^g^g ^j.^^ .g ^j^g Lambert-M^- 
function given by c = W{c) exp(W^(c)). By studying the 
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stability of the solution and by comparison with numer- 
ical results, one finds that the RS solution is valid in the 
region c < e (where e « 2,718 is the Euler number). 
This has also been proven rigorously by analyzing a 
specific algorithm, the leaf removal algorithm j41,] which 
we will describe in section Ull 

The RS ansatz assumes that all ground states form 
a single cluster. This assumption seems to be violated 
for c > e, where one can construct analytically solutions 
with smaller fraction x of covered vertices. One can ex- 
tend the calculation by including RSB, here we expect 
that the ground states form clusters that are separated 
by extensive energy barriers. A single level of cluster- 
ing corresponds to one-step replica symmetric breaking 
(1-RSB). If the clusters itself have some hierarchical clus- 
tering structure, i.e. a set of very similar solutions is sub- 
divided in a structured manner in subsets of even more 
similar solutions, n-step-RSB or fuU-RSB (the last being 
the case where n = oo) appears. However, this fuU-RSB 
is not necessarily the same as found in the SK model. 

In our paper we will analyze the ground-state struc- 
ture of VC numerically. Especially we will focus on the 
behavior of the cluster structure around c = e. We have 
studied different definitions of clusters and methods to 
detect nontrivial clustering. They have in common that 
solutions which are very similar to each other are consid- 
ered to be in one cluster. Definitions and details are given 
later on. So far it is not clear to what extend the observa- 
tion of clustering phenomena depends on the definition of 
the clusters applied and which one is the "correct" one to 
describe RS or RSB. Nevertheless, the results presented 
below for the different methods turn out to be compatible 
with onset of clustering at c — e, supporting the previous 
analytical findings. 

The rest of the paper is organized as follows: First we 
will describe in detail algorithms that we used to find 
minimal VCs: Branch-and-Bound, our backbone algo- 
rithm and a Monte Carlo approach. In the main section 
mil we will present three different methods for analyzing 
the ground-state structure and the corresponding results: 
direct clustering, analysis of the eigenvalue spectrum of 
correlation matrices and Ward's algorithm, which is a hi- 
erarchical clustering method. Finally, we will summarize 
and give an outlook. 



II. ALGORITHMS FOR FINDING GROUND 
STATES 

The vertex-cover problem is NP-complete, so all known 
algorithms have a solution time that in the worst case 
grows exponentially with the number of variables. In 
the typical case, algorithms often perform better. This 
behavior seems to be closely related to the cluster struc- 
tures, which we study in this work. For connectivities 
c < e, a minimal VC can be found typically in a time 
polynomial in the number of vertices using the leaf re- 
moval algorithm, which is explained below. Thus the 



problem is typically easy in this region of connectivity. 
On the contrary there is no similar algorithm known for 
random graphs with c > e. So the point c = e is also 
interesting from an algorithmic point of view. 

We use two different methods to generate VCs. Both 
will be explained in this section: 

1. exact enumeration of all ground states 

2. sampling the structure of close-to- minimum covers 
with Monte Carlo methods 



A. Exact enumeration 

The ideal case to study the cluster structure of the solu- 
tions is to have all solutions available. Since the number 
of solutions grows very fast with system size TV, a direct 
enumeration is not the best choice. We will now explain 
in several steps the algorithms we have used. First we 
always split up the graph into its connected components, 
because they can be treated independently. 

We now explain, how one solution can be found (for 
each component). As a first step, we apply the leaf- 
removal algorithm j4l| , which is a special variant of the 
algorithm by Tarjan and Trojanowski The idea of 
leaf removal is the following: A leaf of the graph is a ver- 
tex i with connectivity one, i.e. it has only one neighbor 
vertex j. In a VC either i or j has to be covered. If 
we covered the leaf i then only the edge between i and j 
would be covered. Thus wc can cover j and so all edges 
originating from j are covered including the one to i. We 
no more need to consider these edges, therefore we re- 
move them from the graph, possibly creating new leaves. 
We iteratively repeat this procedure until the graph is 
empty or no more leaves are present and the so-called 
core remains. These steps take polynomial time in N. 
Bauer and Golinelli show that for c < e this core is com- 
posed of small components of size log(iV), while for larger 
c a complex structure of size 0{N) remains. 

The core has to be treated with a more elaborate 
method, the Branch-and-Bound algorithm. Its basic idea 
is that all possible configurations can be represented as 
a binary tree. At each node the tree splits into two sub- 
trees corresponding to setting one vertex to covered or to 
uncovered. Some of the 2^ leaves correspond to vertex 
covers, some even to minimal vertex covers. The algo- 
rithm starts at the root node, by selecting any vertex. 
The order the vertices are selected is in principle arbi- 
trary. A good performance can be obtained, when e.g. 
selecting the vertices in the order of the current degree, 
i.e. the number of currently uncovered neighbors. Since 
at this point we do not know which vertices have to be 
covered, we have to branch: We set one of the variables 
to one of the two possible values, i.e. we go down one of 
the branches that start at the root node. Iteratively we 
continue this procedure until a full VC has been found. 
Then we go back to an earlier branching point, one calls 
this backtracking, and explore the other subtrees. 



Note, after setting a vertex to covered, new leaves may 
appear in the graph. In this case, we remove them by 
applying leaf removal again. The removed vertices have 
to be inserted again, when backtracking. Furthermore, 
since we are interested only in full covers, we can always 
cover all neighbors of a vertex we have uncovered. 

A significant speed-up can be achieved by bounding. 
The basic idea is to omit subtrees, where for sure no mini- 
mum vertex cover can be found. This can be achieved, by 
keeping track of of the smallest cover X^i^ found so far. 
Let now, at any stage when building the tree, X < Xy^in 
be the current number of covered vertices and let d{j) 
be the current number of uncovered edges that connect 
to the uncovered vertices j. Since we are looking for 
a smaller cover than Xmin, we want to cover at most 
k — Xniin — X — 1 additional vertices. By covering k 
vertices we can reduce the number of uncovered edges at 
most by Af — maxjj^... j,, d{ji) + . . . d{jk)- If the number 
of currently uncovered edges is greater than M we can 
omit this branch of the search tree since it cannot lead 
to a new optimum. Note that the Branch-and-Bound al- 
gorithm takes in the worst case an exponential running 
time. Since the core is only of order Oi^ogN) for c < e, 
this results in a polynomial running time in combination 
with leaf-removal for these values of c. 

Having found a single ground state, in order to enumer- 
ate all solutions, we can now again reduce the size of the 
problem by identifying the vertices that have in all mini- 
mum solutions the same state, the so-called backbone. We 
first consider the covered backbone, i.e. vertices which 
are covered in all minimum solutions. 

Suppose that these solutions have X vertices covered 
and vertex i is in the covered backbone. If we fixed i to 
be uncovered then we could only find vertex covers with 
at least X + 1 covered vertices. This is the idea of our 
backbone algorithm (cf. Fig. ^ : 

1. Select a covered vertex i 

2. For each edge (i, j) add a new vertex rij and a new 
edge {nj,j) to the graph 

3. Remove vertex i and all its edges from the graph 

4. Find the ground-state energy (cover size) of the 
new graph G' . Since all vertices rij are now leaves, 
G' has a ground state with all Uj uncovered. If 
X{G') = X{G) then we also have found a ground 
state of G, with vertex i uncovered. Obviously the 
converse is also true. So we have X{G') = X{G) <^ 
i ^ covered backbone 




FIG. 1: Identifying the backbone: A vertex i is fixed to be 
uncovered by replacing it with new vertices, one for each edge 
(note: a leaf of the original graph can never belong to the 
covered backbone) . i belongs to the covered backbone, iff the 
minimal VCs of the new graph are larger than in the original 
one. 



remaining graph often breaks apart into different com- 
ponents which can be treated individually. This speeds 
up the Branch-and-Bound algorithm when we now enu- 
merate all ground states. In Fig.|21we compare the me- 
dian number of ground states of the largest component 
before and after backbone removal, respectively. In both 
cases the number grows exponentially, but the exponent 
is reduced, especially for smaller c. This can be easily 
understood, since e.g. a single component of two ver- 
tices that gets separated from the largest component due 
to the removal of the backbone reduces the number of 
ground states of this component by a factor of two. 




FIG. 2: Median number of ground states of the largest com- 
ponent for different values of c. The circles represent values 
before removal of the backbone, the triangles after the re- 
moval (error bars are at most of symbol size). The smaller c 
the larger is the speedup, which is due to reduced size of the 
largest component. 



Since the backbone vertices are fixed and all adjacent 
edges are covered, we can remove them from the graph 
without changing the number of solutions. Vertices, that 
have only neighbors belonging to the covered backbone 
are uncovered in all solutions, they form the uncovered 
backbone. After the removal of the covered backbone 
they become isolated vertices and can be removed as 
well. Since the backbone size is rather large 12^, the 



For the enumeration of all ground states, we use a 
variant of the Branch-and-Bound algorithm without leaf- 
removal. Also we allow at each node for k — ATmin — X 
additional covered vertices instead oi k — Xmin — X — 1, 
where Xmin is the size of a minimal vertex cover for the 
current component, which is known from the first step. 
We store all covers which have the size Xmin- When the 
algorithm terminates, all minimum covers are stored. 
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To summarize, the general outline of the algorithm is 
as follows: 

begin 

split the graph into connected components; 

for each component 

begin 

find a single solution using leaf removal 

and Branch-and-Bound; 
determine the backbone vertices; 
remove all backbone vertices; 
split the graph into connected components; 
for each component 
begin 

enumerate all solutions using 
Branch-and-Bound; 

end 
end 
end 

The complete set of all solutions contains all possi- 
ble combinations of covers for all components. Since the 
trees contribute only a trivial background to the solution 
landscape, we analyze in the following very often only 
the largest component of each graph. 

The complete algorithm can find one minimal VC for 
graphs with sizes between N « 200 (for c — 6) and 
N w 2000 (for c = 3). The number of solutions grows 
strongly exponential, so enumeration of all ground states 
is possible only up to N 100. 



begin 

initialize configuration j^!^'! ■ • • {•^i"'} randomly 

for t — 1 . . . Nmc do 

begin 

for each copy k — 1 . . . n do 
do A*' times 

begin {perform one MC step} 
choose a random vertex v 
with probability 1/2 do step (M) or (E) 
(M)if V is covered and has exactly one 
uncovered neighbor w then 

uncover v and cover w 
else do nothing 
(E) if V is uncovered then 

cover it with prop, e"*^' 
else if V and all its neighbors 
are covered then 
uncover v 

end 

for fc = 1 . . . n — 1 

begin {perform PT moves} 

set AE, = (/X, - M.+i) .(E, ' - 

with prop. exp(— min(Ai?i, 0)) 

exchange *"* j^i*'^^''! 

end 
end 
end 

TABLE I: Parallel tempering Monte Carlo algorithm. A^'mc 
denotes the number of MC sweeps per copy, n the number of 
different copies 



B. Monte Carlo algorithm 

For larger graphs, we apply a parallel tempering (PT) 
lillill Monte Carlo (MC) algorithm to sample con- 
figurations. 

The basic approach is that we simulate the behavior of 
an equivalent system, the hard-core lattice gas |24|. The 
graph corresponds to a lattice with edges of length one. 
Each vertex corresponds to a site of the lattice that can 
be occupied by a hard sphere with radius one. The states 
covered/uncovered of the vertices correspond to the two 
possibilities not occupied/occupied in this order. Hence, 
for a given cover U of the graph, we assign an occupation 
number Xi to each site of the lattice: 

{0 if corresponding vertex i , . , 

(1) 
1 if corresponding vertex i ^ U 

The condition, that in a VC for each edge at least one 
of its vertices has to be covered, implies for edges on the 
lattice that at most one of the two endpoints can have 
the occupation number 1. In other words, if a sphere 
is put on site i then all sites that are connected to i by 
an edge cannot be occupied. A given assignment to the 
occupation numbers is thus a VC, if the characteristic 



function 

xis)= n i^-^^^j) (2) 

equals one. We can control the number of spheres by 
applying the grand-canonical formalism. The grand- 
canonical partition function S is given by 

S= ^ exp(^fi^x,'jx{x) (3) 

Xi=0,l 

where fi is the chemical potential. Configurations with 
a larger number of spheres get an exponential greater 
weight. In the large /i-limit the sum is dominated by the 
configurations where the largest number of hard spheres 
is put on the lattice. These configurations correspond to 
minimal VCs. 

The MC moves |3 which sample Eq. ||3Jl consist in 
selecting a vertex randomly and performing with proba- 
bility p = 0.5 either a move (M) or an exchange (E) step. 
For details, see the algorithm in Tab. 

The MC simulation is performed within a PT [i^ 
framework. Its idea is that different copies of the system 
(for the same graph) are simulated each at a different 
value of the chemical potential fj,. At lower values of fj, 
spheres can be more easily removed than at higher ones, 
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so the system can overcome larger energy barriers. At 
high values of n the system equilibrates to a local min- 
imum. The basic PT step is to perform exchanges of 
the configuration for neighboring values of the chemical 
potential in a way such that globally detailed balance is 
ensured, for details, see Tab.O] 




FIG. 3: Average size x of the vertex cover for different values 
of the chemical potential n (N=64000). In the large- limit 
X approaches the average size x^ of the minimal vertex cover 

In our simulation we use n = 26 values between /i = 1 
and /i = 12. The simulations run for ^Vmc = 10^ MC- 
sweeps. 500 configurations are saved for a value /i = 9. 
At this value of /i most of the sampled states have the 
ground state energy, cf. Fig. |3| For the system sizes 
that are tractable by the exact algorithm (cf. beginning 
of subsection) we additionally verified that parallel tem- 
pering really finds ground states. 



III. CLUSTERING METHODS 

A. Clustering using hamming distance 

Our first naive approach for analyzing the ground- 
state structure is based on the hamming distance 
between different solutions. The hamming distance 
rf*siham({^^"^} , {2?^'^^}) = da/3 of two solutions is the 
number vertices in which the two configurations differ. If 
for two optimal solutions their hamming distance is min- 
imal, i.e. distham{{x'-^"^} , {af*-'^-'}) = 2, we will call them 
neighbors. Since for a given realization all ground states 
have the same energy, neighboring states differ only in 
two vertices i and j which are linked by an edge. In 
other words, one can get a neighboring state x'-"^ of a 
given ground state by choosing a covered vertex i which 
has the property that all but one vertex j of the adjacent 
vertices of i are covered. The state with i uncovered and j 
covered is a neighboring ground state of x'"-* . If we think 
of covering marks put on each covered vertex, then this 
is equivalent to moving a covering mark along an edge to 
an adjacent vertex. Step (M) of the MC algorithm above 



exactly corresponds to this move. 

We define a cluster C as maximal set of ground states, 
that can be reached by repeatedly applying the above 
procedure. Similar definitions of clusters have been used 
e.g. for the analysis of random p-XOR-SAT jsj] or finite- 
dimensional spin glasses |43] . States which belong to dif- 
ferent clusters are separated by a hamming distance of 
at least 4. In appendix 1X1 we will show that for a graph, 
which is a tree, the ground-state structure always consists 
of exactly one single cluster. 

To decide that two arbitrary ground states x^") and 
x^^^ do not belong to the same cluster, one needs to calcu- 
late the complete cluster x^"^ (or x^^^) belongs to. Hence 
the clustering is very expensive. 

The naive algorithm is as follows: 

1. identify the giant component of the graph (we ig- 
nore the 0(1) components since they do not influ- 
ence the cluster structure, cf. Sec. 

2. calculate all ground states x'^"^ as described in sec- 
tion |ll| 

3. cluster the ground-state configurations: 
begin 

S:= set of all ground states 

i = {number of so far detected clusters} 

while S not empty do 

begin 

i = i + l 

remove an element x^"^ from S 

set cluster d = (f(")) 

set pointer x^^^^ to first element of Ci 

while f^/^) <> NULL do 

begin 

for all elements x^'''^ of S 

if dham{x^^\x'-^^) = 2then 
begin 

remove x^'''^ from S 
put x'^''^ at the end of Ci 
end 

set pointer af*^^-' to next element of Ci 
or to NU LL if there is no more 
end 
end 
end 

The crucial point is that one really needs to consider 
all ground states and not just a sample. The algorithm 
is quadratic in the number of ground states x^"^ , which 
makes the method applicable to system sizes up to w 
70, depending on the connectivity c. For every value of 
A^ we sampled 10'* realizations. The average number of 
clusters as function of connectivity is shown as circles 
in figure 0] We mainly use this naive method to judge 
the validity of its extension which will be described in 
the next section IIII Bl For c < e the number of clusters 
remains close to one. For larger values of c the number 



of clusters increases with system size. Apparently the 
increase is compatible with a logarithmic growth as a 
function of system size, see discussion in the next section. 
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FIG. 4: Average number of clusters in the solution space of 
the largest component as function of system size. The circle 
symbols for small system sizes have been obtained by clus- 
tering complete sets of ground states. For large systems we 
sampled ground states with a Monte Carlo algorithm at large 
but finite chemical potential jj.. 



B. Cluster detection using sampled states 

In this section we will show, how one can identify clus- 
ters when only a small fraction of the solution space has 
been sampled, as obtained by using Monte Carlo meth- 
ods such as parallel tempering. We start in one of the 
configurations x^"-* and follow a local exchange dynamics 
which does not change the energy, i.e. the size of the 
cover. If we can reach another configuration x*^^) then 
x^") and x^^^ are in the same cluster, according to the 
cluster definition above. 

Let us compare two ground states a;*^"^ and x'-^K We 
do not need to consider vertices that are already covered 
in both states. Let cow*^"-' be the subset of vertices of 
G that are covered in state x^"^ but uncovered in state 
x^^\ In the same way we define cov^^\ Since all ground 
states have the same number of covered vertices both 
sets must have the same size. Moreover, all vertices in 
cov^"^ must be neighbors of vertices in cov^^\ otherwise 
the configurations would not be vertex covers. So the 
subgraph G' of G which contains all vertices in cov^"^ 
and cov'^^'^ and all the edges from G running between 
these vertices is a bipartite graph. 

The following algorithm moves cover marks on the 
graph G" to find out whether x^") and x*^^) belong to 
the same cluster: 

1. select a vertex v in G" which is covered in state x^"^ 
and which has exactly one neighbor w in G" (i.e. w 
is covered in state x^'^^); if no such v exists: stop 






FIG. 5: Can one reach state /3 from state a just by moving 
one cover mark at a time, never uncovering any edge? The 
algorithm tries to find the answer by looking at the bipartite 
graph induced by all nodes that are covered either in a or in 
13 (but not in both) 



2. remove v and w from G", i.e. set G" :~ G'\{v^\{'w} 

3. go to step (1) 

Each pair of vertices taken out in step (2) corresponds 
to moving a covering mark along the edge connecting v 
and w. Since w is always the only uncovered neighbor of 
V the algorithm only visits states that are ground states. 
Note that each covering mark is moved at most once, 
for this reason, we call this procedure "ballistic search" 
|48| . This method has been already applied to study the 
gro und-state structure of finite-dimensional spin glasses 

If the algorithm stops with G" = we have found a 
path in configuration space between states a and /3 that 
only goes through ground states and we know for sure 
that a and /3 are in the same cluster. 

On the other hand, if such a path exists where each 
covering mark has to be moved at most once, then the 
algorithm is guaranteed to find it We prove this 

by contradiction. The main reason is that for two given 
states a and /3 the cover mark on any vertex v is moved to 
the same vertex w in all possible paths, i. e the individual 
moves are unique, only the order in which they are done 
can differ between paths. 

Suppose the opposite would be true, i.e. there exists a 
path P in which the mark on vertex v is moved to vertex 
w and a path P' in which it is moved to vertex w' . Take 
the first vertex u in P for which this is true. The moves 
for all cover marks moved prior to in P are the same in 
P' . So one can do all these moves, afterwards v has only 
one uncovered neighbor, namely w. Next, move all cover 
marks in P', which have not yet been moved. Now w will 
be covered, too. But then vertex v and all its neighbors 
will be covered, which is impossible in a ground state. 
Contradiction, there cannot be two such paths. 

Hence, if the algorithm stops in step (a) with G" 7^ 0, 
then no such path exists where each covering mark is 
moved at most once. This means that either x'"^ and 
x^'^^ are in different clusters or that they are connected 
by a path such that a covering mark has to be moved at 
least twice. To exclude that the clustering is wrong be- 
cause some configurations are connected by paths where 
a clustering mark is moved more than once, we compare 
all configurations pairwise with each other. This means, 
we use the transitivity of the cluster to exclude that two 
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configurations are mistaken to be in different clusters al- 
though they are in the same And indeed we have 
sometimes observed that for three configurations a, /3, 7, 
paths are found from a to /3 and from /3 to 7, but not 
from a to 7. 

For each realization we sample with parallel tempering 
500 configurations during a time of 10^ MC steps as de- 
scribed in section IIIBI This number of configurations is 
far high enough to ensure that never two configurations 
belonging actually to the same cluster are mistaken to be 
in different clusters. On the other hand, it might happen 
that for some (small) clusters no configurations are sam- 
pled. We have tested this explicitly by calculating the 
number of clusters as a function of the number of config- 
urations included in the clustering, see Fig. |^ One can 
see that for small connectivities the number of clusters is 
more or less independent on the size of the sample, while 
for larger values of c and larger system sizes, the number 
of clusters increases slightly with the sample size. This 
means that in Fig. 0] where we show the average num- 
ber of clusters we find for different connectivities (for the 
largest sample size), we have basically a lower bound for 
the real average number of clusters. The main result is 
unaffected by this sample-size effect: For small connectiv- 
ities c the number of clusters is close to one, independent 
of the system size, and for large values of c the number 
of clusters grows. The results are compatible with the 
change appearing near c — e, but we cannot determine 
the value of the change precisely from our data. Also we 
cannot be sure that the growth of the number of clusters 
is only logarithmically with system size. But it seems 
likely that the number of clusters grows slower than ex- 
ponentially with system size, since for c = 4, iV = 800 we 
find on average less than three clusters. Hence, this is 
different from the 1-RSB phase of the satisfiability prob- 
lem 0, 0, . This slow growth is compatible with the 
analytical result that for c > e the 1-RSB solution is not 
the correct one '25^ , hence a higher level of RSB is to be 
expected. 



Extensive eigenvalues and the number of 
clusters 



In this section we will use a completely different 
method to analyze the structure of the solution space. 
Sinova et al. 0, 0| describe a tool for counting inde- 
pendent pure states in Ising spin glasses. Here we sum- 
marize the basic aspects of their method. Their main 
idea is to study the spectral properties of the spin-spin 
correlation matrix (SiSj) = Cy where ( ) indicates the 
thermal average. This matrix is semidefinite and since 
(SiSi) — IVi it has trace N. For spin-glasses above the 
ordering temperature Tc, all eigenvalues are of order one. 
Below Tc, long-range order appears. If there is a single 
pair of pure states, then in the low temperature limit 
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FIG. 6: Behavior of the number of detected clusters depend- 
ing on the number of sampled states for different values of 
c. The crossed symbols are for c = 4, circles for c — 3 and 
small bars for c = 2. For c < e the numbers are confined 
within error bars, for c > e only a fraction of the clusters are 
detected. The number of detected clusters is still increasing 
with the number of sampled states. 



decays to zero with a power-law in N. So one can detect 
the presence of long range order just from analyzing the 
spectrum of C'ij . 

In the frozen, disordered phase, the phase space breaks 
up into many pairs of pure states. They are characterized 
by their clustering property which we will explain in 
more detail in the next subsection IIII Dl Sinova at al. 
argue that the number number of extensive eigenvalues 
of C'ij corresponds to the number of independent pure 
states of the system. This makes it possible to detect 
RSB, which must be present, if the correlation matrix has 
more than one extensive eigenvalue. Note that this way 
of looking at the phase-space structure is different from 
looking at the clusters: The number of clusters may grow 
exponentially with the system size, while the number of 
independent pure states can never be larger than N , since 
a, N X N matrix has only N eigenvalues. 

We apply this method directly to the vertex-cover 
problem. For every realization we calculate Cy aver- 
aged over the configurations sampled by parallel temper- 
ing with S'i = 1 if vertex i is covered and 5^ = — 1 if it 
is uncovered. We calculate the three largest eigenvalues 
and average over 100 to 400 realizations, depending on 
the system size. 

In Fig. we show our results for different values of c 
and N at fi = 9. As one can see in the next section, this 
fi is large enough to allow for a nontrivial behavior. We 
plot the normalized value of the second and third largest 
eigenvalue as a function of system size. As expected for 
c = 1 and c — 2 the system is found to be in the replica 
symmetric phase: There is only one extensive eigenvalue, 
the second and the third decay with a power of N. 

For very large c the behavior is different. The second 
largest eigenvalue reaches a plateau value around N = 
200.. 300. The closer the system is to the c = e the later 
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FIG. 7: Scaling of the second largest (top) and third largest 
(bottom) eigenvalue of dj 



this plateau is reached. Especially for c = 3 the behavior 
is not yet clear from the reachable system sizes. The same 
applies to the third eigenvalue, although one can see a 
difference between the largest and the smallest values of 
c. However, with the reachable system sizes we cannot 
rule out the possibility that the third eigenvalue slowly 
decays for all connectivities. 

Supposing that the behavior of [A2] does not change 
again for large N, we conclude that RSB must be present 
starting from a value of 2 < c < 4 . Please note that we 
cannot distinguish between 1-RSB and higher order of 
RSB from this method. For this reason, we have applied 
another method described in the next section. 



D. Hierarchical Clustering Approach 

In this last subsection, we will use a clustering ap- 
proach that organizes the states in a hierarchical struc- 
ture. Such clustering methods [soj are widely used in 
general data analysis, sometimes also used in statistical 
mechanics, see e.g. Refs. Ll3, 51, 52r The methods all 
start by assuming that all states belong to separate clus- 
ters. Similarity between clusters (and states) is defined 
by a measure called proximity matrix da^p. At each step 
two very similar clusters are joined and so a hierarchical 
tree of clusters is formed. 



A valid hierarchical clustering implies a true ultramet- 
ric structure 0. Such a structure is a very important 
property of the Parisi-RSB solution 0] of the mean-field 
SK- model: All triples {x^"'^ x^^\ x^^^ ) of ground states 
form isosceles triangles with the third side shorter or 
equal to the other two sides. 

We will try to detect a hierarchical structure in the 
phase space of finite-size instances of VC. As proximity 
measure for two initial clusters, each containing only a 
single state, we naturally choose the hamming distance 
between these two states as defined in Sec. lIII Al divided 
by the number of vertices. At each step the two clusters 
Ca and Cf3 with the minimal distance are merged to form 
a new cluster C-y . Then the proximity matrix is updated 
by deleting the distances involving Ca and C/5 and adding 
the distances between C-y and all other clusters Cs in the 
system. So we need to extend the proximity measure to 
clusters with more than one state, based on some suitable 
update rule which is usually a function of the distances 
da.p, da,5 and djs^s. 

The choice of this function is a widely discussed field 
since it can have a great impact on the clustering ob- 
tained |50| . It should represent the natural organization 
present in the data and not some artificial structure in- 
duced from the choice of the update rule. Here we will use 
Ward's method (also called minimum variance method) 
[ssf . The distance between the merged cluster C-y and 
some other cluster Cg is given by 



{ria + ns)da,s + {np + ns)d(3^s ~ [na + nf3)da.p 

"7,<5 ■ ; ] — 

+ + ns 

(4) 

where ria^np^ns are the number of elements in clus- 
ter Ca,Cis, Cs, respectively. Heuristically Ward's method 
seems to outperform other update rules. The choice guar- 
antees that at each step the two clusters to be merged 
are chosen in a way that the variance inside each clus- 
ter summed over all clusters increases by the minimal 
possible amount. 

The output of the clustering algorithm can be repre- 
sented as a dendogram. This is a tree with the ground 
states as leaves and each node representing one of the 
clusters at different levels of hierarchy, see the bottom 
half of the examples in Fig. |S1 

Note that Ward's algorithm is able to cluster any data, 
which can be always displayed as a dendogram, even if 
no structure is presented. Hence, one has to perform ad- 
ditional checks. A visual check is to plot the hamming 
distances as a matrix where the rows and columns are or- 
dered according to the dendogram. This is shown in the 
top half of Fig. |S| Darker colors correspond to smaller 
distances. The figure shows three different realizations: 
For small values of /i no cluster structure is present. For 
small values of c < e and large values of ^, the system is in 
the RS phase, only a single cluster is present. For larger 
values of c and high values of /i, the ordering of the states 
obtained by the clustering algorithm reveals an underly- 
ing structure which can be seen in the right part of the 
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FIG. 8: Sample dendograms of 100 ground states for a graph with 400 vertices. Darker colors correspond to closer distances. 
The left one is at c = 2 and fi — 9, i.e. in the RS phase. There is no structure present. The same is true for c = 6 and 
fj, = 2. For c — 6 and /i = 9 the dendogram provides a structure, where the ground states form clusters. The careful reader 
may recognize a second or third level of clustering in this right picture. 



figure. One can see that the states form groups where the 
hamming distance between the members is small (dark 
colors) while the distance to other states is large. Thus, 
our results are compatible with clustering being present 
for realizations with c > e. If you look carefully you can 
see more structure inside the clusters. Multiple levels of 
clustering indicate higher levels of RSB which we expect 
to be present for these values of c US ■ 

To check more quantitatively whether the cluster 
structure detected by the algorithm is actually present 
in the data we evaluate the cophenetic correlation coeffi- 
cient 



(5) 



where [. . .]g denotes the average over the disorder. This 
coefficient measures the correlation between the original 
distance d of two states and their cophenetic distance dc 
imposed by the clustering, dc is measured on the dendo- 
gram as the distance given by Eq. of the two largest 
clusters that contain only one of the states. 

The results of this test are shown in Fig. |S1 The av- 
erages are over all samples generated with parallel tem- 
pering (cf. Sec. Ill B|) . As one sees, there is no correlation 
for small values of c. This is as expected, because for 
c < e no cluster structure is present. K- increases with 
increasing magnitude of the connectivity. In particular, 
the different curves for N > 100 cross near c = e. For 
small values of c, /C decreases with growing system size, 
while for large values of c, JC increases. This indicates 
again that around c = e a hierarchical organization of 
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FIG. 9: The correlation between hamming distance and 
cophenetic distance measured on the dendogram increases 
with c 



the VC solution space sets in. However, for larger values 
of c the average correlation seems to converge to a value 
close to K, « 0.8. This means that the clustering imposed 
by Ward's method does not fully represent the structure 
inherent to minimal VCs. 
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IV. CONCLUSION 

In our paper we analyzed the ground-state properties 
of the vertex-cover problem. Especially we focussed on 
the phenomenon of clustering. We found that for con- 
nectivities c < e basically only one ground-state clus- 
ter is present. For larger connectivities the number of 
numerically detected clusters increases, apparently log- 
arithmically. This is compatible with the fact that in 
analytical calculations, for c > e, the replica-symmetric 
solution is not longer valid and the level of RSB seems to 
be higher than 1-RSB. More evidence for the appearance 
of RSB was found by analyzing the spectral properties 
of the vertex-vertex correlation function: For c > e two 
or more eigenvalues are extensive which can only be the 
case, if RSB is present. 

With a clustering approach using Ward's algorithm, 
we tried to detect directly a hierarchical structure in the 
ground states. We find qualitatively higher levels of clus- 
tering present in the ground-state structure for high val- 
ues of c. This would indicate higher level of replica sym- 
metry breaking. Also, for c > e, the clustering imposed 
by the algorithms becomes more and more compatible 
with the structuring of the state space. 

In summary, the different algorithms are able to find 
indications for RSB in the solution landscape of combi- 
natorial optimization problems. Note that the presence 
of RSB does not necessarily mean that it is the same type 
of RSB, which is found in the solution of the SK model. 
The details of the organization of the solution space, e.g. 
the extent of ultrametricity, can be different. This can 
be seen in the convergence of the cophenetic correlation 
coefficient to a value apparently smaller than one. 

From our results, which support the previous analyti- 
cal findings, we conclude it seems promising to apply the 
methods to other more complicated ensembles of VC or 
to other optimization problems, where less analytical re- 
sults are available, in order to understand their behavior 
better. 
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APPENDIX A: THE SOLUTION SPACE OF 
VERTEX COVER ON A TREE CONSISTS ONLY 
OF A SINGLE CLUSTER 

In section IIII Al we defined a cluster C as a max- 
imal set of ground states such that for each pair 
^(q)^^(/3) g (J there exists a series {x''^^^)i=o...k of 



ground states with x^^°^ = x^"^ and x^*'-'^ = x^^^ and 
distha7n{x^^'\ x'^^'*^^) — 2, i.e. minimal hamming dis- 
tance between consecutive elements of the series. In this 
appendix we will show, that for trees there can be only 
one cluster. The proof will be by induction on the num- 
ber TV of vertices in the tree. 

For N — 2 there are two ground states, which have 
hamming distance 2. 




FIG. 10: If there is a ground state with vertex v uncovered, 
then all subgraphs must have the same number of covered 
vertices in both states. 

Suppose we have proven the statement for N and con- 
sider a graph of size -I- 1. First note that there is at 
least one ground state x''^^ with a vertex v covered that 
is a neighbor of a leaf vq. Such a ground state can be 
constructed e.g. using leaf removal. We show separately, 
that x'^^^ is in the same cluster as all ground states {x'^^ ^} 
with V covered and with v uncovered. 

Let x'-^ ) be a ground state with v covered. If we delete 
vertex v from the tree, then it falls apart into components 
Gi, . . . Gk where k is the connectivity of v. x'--^'^ induces 
a cover on each Gi which is also a minimal cover on each 
subgraph, since we started with a minimal cover on G. 
The same is true for x^^ ^ . Each of the subgraphs has 
size smaller than N, so by induction we can construct 
a series from x^*^ to x^* ' separately on each subgraph, 
hence both ground states are in the same cluster. 

Now consider a ground state x^^ ^ that has v uncovered. 
Again we consider the subgraphs one gets by removing 
V from the graph. Let Xi be the number of covered ver- 
tices in the cover induced from x'''^ on the subgraph Gi, 
analogues let X'i be the number of covered vertices in 
the cover induced from x'-'' . Since x'^^^ and x'-'' ^ both 
are ground states we have J^i Xi + 1 = J^i -^'i which is 
equivalent to 1 = J^ii-^'i ~ ^i)- summands on the 
right side must be non negative, otherwise x^''' would not 
be a ground state. So there exists exactly one subgraph 
Gj with X'i — Xi = 6i,j. This subgraph must be the leaf 
vo- For i ^ j the covers induced by x^'' ^ on Gi must 
be ground states of the subgraph, since X'i = Xi. So 
by induction we can construct a series from x^^^ to x*-* ' , 
again separately on each subgraph Gi for i ^ j and on 
the subgraph {v} U {vq}, hence both ground states are in 
the same cluster. 

Together we showed that all ground states are in the 
same cluster as x*-*^ , thus there can only be a single clus- 
ter of ground states. 
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